Comparative Analysis of Machine Learning Models for PDF Malware Detection: Evaluating Different Training and Testing Criteria


Abstract

The proliferation of maliciously coded documents as file transfers increase has led to a rise in sophisticated attacks. Portable Document Format (PDF) files have emerged as a major attack vector for malware due to their adaptability and wide usage. Detecting malware in PDF files is challenging because of their ability to include various harmful elements such as embedded scripts, exploits, and malicious URLs. This paper presents a comparative analysis of machine learning (ML) techniques, including Naive Bayes (NB), K-Nearest Neighbor (KNN), Average One Dependency Estimator (A1DE), Random Forest (RF), and Support Vector Machine (SVM), for PDF malware detection. The study utilizes a dataset obtained from the Canadian Institute for Cyber-security and employs different testing criteria, namely percentage splitting and 10-fold cross-validation. The performance of the techniques is evaluated using F1-score, precision, recall, and accuracy measures. The results indicate that KNN outperforms the other models, achieving an accuracy of 99.8599%. The findings highlight the effectiveness of ML models in accurately detecting PDF malware and provide insights for developing robust systems to protect against malicious activities.
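The two testing criteria and the four measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say which toolkit or features were used, so the code below relies only on the Python standard library, a hand-written KNN classifier, and synthetic two-dimensional points standing in for PDF feature vectors. All function names (`binary_metrics`, `percentage_split`, `k_fold_indices`, `knn_predict`, `evaluate`) are hypothetical.

```python
import random
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 with `positive` as the malicious class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs),
            "precision": precision, "recall": recall, "f1": f1}


def percentage_split(n, train_fraction=0.8, seed=0):
    """Shuffle indices 0..n-1 once and cut them into a train part and a test part."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * train_fraction)
    return idx[:cut], idx[cut:]


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x (squared Euclidean)."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def evaluate(X, y, train_idx, test_idx, k=3):
    """Fit-free KNN evaluation: predict every test point from the train points."""
    train_X = [X[i] for i in train_idx]
    train_y = [y[i] for i in train_idx]
    preds = [knn_predict(train_X, train_y, X[i], k) for i in test_idx]
    return binary_metrics([y[i] for i in test_idx], preds)


# Synthetic stand-in data (assumption): class 0 = benign, class 1 = malicious.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)] + \
    [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(100)]
y = [0] * 100 + [1] * 100

# Criterion 1: a single percentage split.
train_idx, test_idx = percentage_split(len(X), 0.8)
print("percentage split:", evaluate(X, y, train_idx, test_idx))

# Criterion 2: 10-fold cross-validation, averaging each measure over the folds.
fold_results = [evaluate(X, y, tr, te) for tr, te in k_fold_indices(len(X), 10)]
mean = {m: sum(r[m] for r in fold_results) / len(fold_results) for m in fold_results[0]}
print("10-fold CV mean:", mean)
```

A percentage split produces one estimate that depends on which samples happen to land in the test part, whereas 10-fold cross-validation tests every sample once and averages the folds, which is why the two criteria can report different scores for the same model.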


Similar resources

Comparative evaluation of machine learning-based malware detection on Android

The Android platform is known as the market leader for mobile devices, but it has also gained much attention among malware authors in recent years. The spread of malware, a consequence of its popularity and the design features of the Android ecosystem, constitutes a major security threat currently targeted by the research community. Among all counter methods proposed in previous publication...


Evading Machine Learning Malware Detection

Machine learning is a popular approach to signatureless malware detection because it can generalize to never-before-seen malware families and polymorphic strains. This has resulted in its practical use for either primary detection engines or supplementary heuristic detections by anti-malware vendors. Recent work in adversarial machine learning has shown that models are susceptible to gradient-ba...


Misleading Metrics: On Evaluating Machine Learning for Malware with Confidence

Malware poses a serious and challenging threat across the Internet, and the need for automated learning-based approaches has become rapidly clear. Machine learning has long been acknowledged as a promising technique to identify and classify malware threats; such a powerful technique is unfortunately often seen as a black-box panacea, where little is understood and the results, especially with high...


Malware and Machine Learning

Malware analysts use Machine Learning to aid in the fight against the unstemmed tide of new malware encountered on a daily, even hourly, basis. The marriage of these two fields (malware and machine learning) is a match made in heaven: malware contains inherent patterns and similarities due to code and code pattern reuse by malware authors; machine learning operates by discovering inherent patter...


Comparative Analysis of Machine Learning Algorithms with Optimization Purposes

The fields of optimization and machine learning increasingly interact, and optimization in different problems leads to the use of machine learning approaches. Machine learning algorithms work in reasonable computational time for specific classes of problems and have an important role in extracting knowledge from large amounts of data. In this paper, a methodology has been employed to opt...


Journal

Journal title: Journal of Cyber Security

Year: 2023

ISSN: 2579-0064, 2579-0072

DOI: https://doi.org/10.32604/jcs.2023.042501